Abstract. Consider N equally-spaced points on a circle of circumference N.
Choose at random n points out of N on this circle and append clockwise an arc
of integral length k to each such point. The resulting random set is made of
a random number of connected components. Questions such as the evaluation
of the probability of random covering and parking configurations, number and
length of the gaps are addressed. They are the discrete versions of similar
problems raised in the continuum. For each value of k, asymptotic results
are presented when n, N both go to $\infty$ according to two different regimes.
This model may equivalently be viewed as a random partitioning problem of
N items into n recipients. A grand-canonical balls in boxes approach is also
supplied, giving some insight into the multiplicities of the box filling amounts
or spacings. The latter model is a k-nearest neighbor random graph with N
vertices and kn edges. We shall also briefly consider the covering problem in
the context of a random graph model with N vertices and n (out-degree 1)
edges whose endpoints are no longer bound to be neighbors.

Running title: Bose-Einstein and Integer Partitioning 

Keywords: Random integer partition, random allocation, discrete covering
of the circle, discrete spacings, balls in boxes, Bose-Einstein, k-nearest
neighbor random graph.



1. Introduction 

Many authors considered the problems related to the coverage of the unit circle by
arcs of equal sizes randomly placed on the circle, among which [20], [19], [5], [6],
[18], [2], [8], [10]. In this Note, motivated by a Remark in the paper ([2], p. 18) on
random graphs, we shall be concerned with a discrete version of the above problem,
following [9] and [14]: Consider N equally spaced points (vertices) on the circle
of circumference N, so with arc length 1 between consecutive points. Sample at
random n out of these N points and consider the discrete random spacings between
consecutive sampled points, turning clockwise on the circle. Let k be an integer
and append clockwise an arc of length k to each sampled point, forming a random
set of arcs on the circle. What is the probability that the circle is covered? If the
circle is not covered, how many gaps do we have in the random set of arcs? What
is the probability that no arcs overlap (the discrete hard rods model)? What is the
probability that no arcs overlap and that the gap lengths are smaller than k itself
(the discrete version of Rényi's parking model)? All these questions require some
understanding of both the smallest and largest spacings in the sample. This model
can equivalently be formulated in terms of the random partitioning of N items
into n recipients. Here also, the distributions of the smallest and largest shares
attached to each of the recipients are of fundamental interest. We will focus on the
thermodynamical limit regime: $n, N \to \infty$ while $n/N \to \rho$ and also, sometimes,
on the regime $n(1-n/N)^k \to \alpha$ with $0 < \alpha < \infty$ considered in Section 3. In the first
regime, the occurrence of, say, covering and parking configurations is exponentially
rare in the whole admissible density range of $\rho$, whereas in the second one they are
macroscopically frequent. At the heart of these models is the Bose-Einstein
distribution for discrete spacings. Finally, a Bosonic grand canonical approach to the
above model will be considered where N balls are assigned at random to N boxes.
For this urn model, we will study the number of empty boxes and the number of
boxes with i balls, giving some insight into the spacings multiplicities, both in the
canonical and the grand-canonical ensembles.

The model just developed is a k-nearest neighbors random graph with N vertices
and kn edges. In the last Section, we consider a random graph with N vertices and
n (out-degree 1) edges whose endpoints are no longer necessarily neighbors, being
now chosen at random on the whole set of vertices. In this model of a different kind,
each of the n sampled points is allowed to create a link far away with any of the
N vertices, not necessarily with neighbors. We estimate the covering probability
for this random graph model in the spirit of Erdős-Rényi (see [1]). We show that,
in sharp contrast to the k-nearest neighbor graph, there exists a critical density
$\rho_c = 1 - e^{-1}$ above which covering occurs with probability one. The take-home
message is to what extent, when connections are not restricted to neighbors, the
chance of connectedness is increased.



2. Random partition of an integer and discrete spacings 

Consider a circle of circumference N, with N integer. Consider N equally spaced
points on the circle, so with arc length 1 between consecutive points. We shall call
this discrete set of points the N-circle. Draw at random $n \in \{2, .., N-1\}$ points
without replacement at the integer sites of this circle (thus, with $M_1, .., M_n$
independent and identically distributed, say iid, and uniform on $\{1, .., N\}$). Pick at
random one of the points $M_1, .., M_n$ and call it $M_{1:n}$. Next, consider the ordered set
of integer points $(M_{m:n}, m = 1, .., n)$, turning clockwise on the circle, starting from
$M_{1:n}$. Let $N_{m,n} = M_{m+1:n} - M_{m:n}$, $m = 1, .., n-1$, be the consecutive discrete
spacings, with $N_{n,n} = M_{1:n} - M_{n:n}$, modulo N, closing the loop. Under our
hypothesis, $N_{m,n} \overset{d}{=} N_n$, $m = 1, .., n$, independent of m, the distribution of which is

$\overline{F}_{N_n}(k) := P(N_n > k) = 1 - F_{N_n}(k) = \binom{N-k-1}{n-1} / \binom{N-1}{n-1}$, with $E N_n = N/n$.



It is indeed a result of considerable age (see e.g. [5]) that identically distributed
(id) discrete spacings $\mathbf{N}_n := (N_{m,n}; m = 1, .., n)$, with $|\mathbf{N}_n| := \sum_{m=1}^{n} N_{m,n} = N$, can
be generated as the conditioning

(1)  $\mathbf{N}_n \overset{d}{=} \mathbf{G}_n \mid \{|\mathbf{G}_n| = N\}$,

where $|\mathbf{G}_n| := \sum_{m=1}^{n} G_m$ is the sum of n iid geometric($\alpha$) random variables
$\ge 1$ (with $P(G_1 \ge k) = \alpha^{k-1}$, $k \ge 1$, $\alpha \in (0,1)$) and so $N_n$ has the claimed
Pólya-Eggenberger PE(1, n-1) distribution: $P(N_n = k) = \binom{N-k-1}{n-2} / \binom{N-1}{n-1}$, $k = 1, .., N-n+1$.

Note that as $n, N \to \infty$, while $n/N = \rho < 1$ is fixed, using Stirling formula, we get
the convergence in distribution

(2)  $N_n \overset{d}{\to} G$,

where $G \ge 1$ is a discrete random variable (rv) with geometric($1-\rho$) distribution:
$P(G > m) = (1-\rho)^m$, $m \ge 1$. The limiting expected value of $N_n$ is $1/\rho$.

With $\mathbf{k} := (k_m; m = 1, .., n)$, the joint law of $\mathbf{N}_n$ is

(3)  $P(\mathbf{N}_n = \mathbf{k}) = \frac{1}{\binom{N-1}{n-1}} 1(|\mathbf{k}| = N)$,

which is the exchangeable uniform distribution on the restricted discrete N-simplex
$|\mathbf{k}| := \sum_{m=1}^{n} k_m = N$, $k_m \ge 1$, also known as the Bose-Einstein distribution.
This distribution occurs in the following Pólya-Eggenberger urn model context (see
[18]): An urn contains n balls all of different colors. A ball is drawn at random
and replaced together with another ball of the same color. Repeating this
N - n times, $\mathbf{N}_n$ is the number of balls of the different colors in the urn. See [9].
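The uniform law (3) and its one-dimensional marginal can be checked by brute force for small sizes. The following is only a sketch: the helper names `compositions`, `marginal_pmf` and `tail` are illustrative and not taken from the paper, and the enumeration is meant for small N only.

```python
from itertools import combinations
from math import comb

def compositions(N, n):
    """Yield all n-tuples of integers >= 1 summing to N (stars and bars)."""
    for cuts in combinations(range(1, N), n - 1):
        bounds = (0,) + cuts + (N,)
        yield tuple(bounds[i + 1] - bounds[i] for i in range(n))

def marginal_pmf(N, n, k):
    """P(N_n = k) under the uniform (Bose-Einstein) law on compositions."""
    return comb(N - k - 1, n - 2) / comb(N - 1, n - 1)

def tail(N, n, k):
    """P(N_n > k) = C(N-k-1, n-1) / C(N-1, n-1), for 0 <= k <= N-1."""
    return comb(N - k - 1, n - 1) / comb(N - 1, n - 1)

if __name__ == "__main__":
    N, n = 10, 4
    parts = list(compositions(N, n))
    for k in range(1, N - n + 2):
        freq = sum(1 for c in parts if c[0] == k) / len(parts)
        print(k, round(freq, 6), round(marginal_pmf(N, n, k), 6))
```

Each printed row compares the empirical frequency of the first part being k with the closed form; the two columns should agree.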

From the random model just defined, we get

(4)  $N = \sum_{m=1}^{n} N_{m,n}$,

which corresponds to a random partition of N into n id parts or components $\ge 1$.



It also models the following random allocation problem (see [11]): N items are to
be shared at random between n recipients. $N_{m,n}$ is the amount of the N items
allocated to recipient m. Although all shares are id, there is a great variability in
the recipients' parts, as will become clear from the detailed study of the smallest
and largest shares in the sample.



This model is connected to the continuous spacings between n randomly placed
points on the unit circle in the following way: As $N \to \infty$, $\mathbf{N}_n / N \overset{d}{\to} \mathbf{S}_n$ where
$\mathbf{S}_n := (S_{1,n}, .., S_{n,n})$ has Dirichlet uniform density function on the continuous unit
n-simplex [17]

(5)  $f_{S_1,..,S_n}(s_1, .., s_n) = (n-1)! \cdot \delta_{(\sum_{m=1}^{n} s_m - 1)}$.

Let $P_n(1) := \sum_{m=1}^{n} 1(N_{m,n} > 1)$ be the number of sampled points whose distance
to their clockwise neighbors is more than one unit. There are $n - P_n(1)$ sampled
points which are neighbors, therefore

$N = 1 \cdot (n - P_n(1)) + \sum_{m=1}^{n} N_{m,n} 1(N_{m,n} > 1) = n + \sum_{m=1}^{n} (N_{m,n} - 1)_+$,

where $i_+ = \max(i, 0)$. Appending an arc of length 1 clockwise to the n sampled
points and considering the induced covered set from $\{1, .., N\}$,
$\overline{\mathcal{L}}_n(1) := \sum_{m=1}^{n} (N_{m,n} - 1)_+$ represents the length of the gaps (the size of the uncovered
set). So, from the model, $\overline{\mathcal{L}}_n(1) = N - n$ is a constant and

$N - n = \sum_{m=1}^{n} (N_{m,n} - 1)_+$

corresponds to a random partition of N - n into n id parts or components $\ge 0$.
Stated differently, the length of the covered set $\mathcal{L}_n(1) = N - \overline{\mathcal{L}}_n(1)$ is constant
equal to n, which is obvious.



Of considerable interest is the sequence $(N_{m:n}; m = 1, .., n)$ obtained while ordering
the component sizes $(N_{m,n}; m = 1, .., n)$, with $N_{1:n} \le .. \le N_{n:n}$.

By the inclusion-exclusion principle, the cumulative distribution function
$F_{N_{m:n}}(k) = P(N_{m:n} \le k)$ is easily seen to be

(6)  $F_{N_{m:n}}(k) = \frac{1}{\binom{N-1}{n-1}} \sum_{q=m}^{n} \binom{n}{q} \sum_{p=n-q}^{n} (-1)^{p+q-n} \binom{q}{n-p} \binom{N-pk-1}{n-1}$,

which has been known for a while in the context of spacings in the continuum (see
[17]).

In particular,

(7)  $F_{N_{n:n}}(k) = P(N_{n:n} \le k) = \frac{1}{\binom{N-1}{n-1}} \sum_{p=0}^{n} (-1)^p \binom{n}{p} \binom{N-pk-1}{n-1}$

and

(8)  $\overline{F}_{N_{1:n}}(k) := P(N_{1:n} > k) = \binom{N-nk-1}{n-1} / \binom{N-1}{n-1}$

are the largest and smallest component size distributions in this case.

In the formula giving $F_{N_{n:n}}(k)$, with $[x]$ standing for the integral part of x, the
sum should as well stop at $n \wedge [\frac{N-n}{k}]$, observing $\binom{i}{j} = 0$ if $i < j$.
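As a sanity check of (7) and (8), the inclusion-exclusion sums can be compared with a direct enumeration of all compositions of N into n parts. This is a minimal sketch; `binom`, `cdf_largest`, `tail_smallest` and `brute` are hypothetical helper names, and the convention that a binomial coefficient vanishes outside its natural range is made explicit in `binom`.

```python
from itertools import combinations
from math import comb

def binom(a, b):
    """Binomial coefficient, set to 0 outside 0 <= b <= a."""
    return comb(a, b) if 0 <= b <= a else 0

def cdf_largest(N, n, k):
    """Formula (7): P(N_{n:n} <= k) by inclusion-exclusion."""
    total = sum((-1) ** p * comb(n, p) * binom(N - p * k - 1, n - 1)
                for p in range(n + 1))
    return total / comb(N - 1, n - 1)

def tail_smallest(N, n, k):
    """Formula (8): P(N_{1:n} > k)."""
    return binom(N - n * k - 1, n - 1) / comb(N - 1, n - 1)

def brute(N, n, event):
    """Fraction of compositions of N into n parts >= 1 satisfying `event`."""
    hits = total = 0
    for cuts in combinations(range(1, N), n - 1):
        bounds = (0,) + cuts + (N,)
        parts = [bounds[i + 1] - bounds[i] for i in range(n)]
        total += 1
        hits += bool(event(parts))
    return hits / total

if __name__ == "__main__":
    N, n, k = 12, 4, 4
    print(cdf_largest(N, n, k), brute(N, n, lambda c: max(c) <= k))
    print(tail_smallest(N, n, 2), brute(N, n, lambda c: min(c) > 2))
```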

Clearly, if k = 1, $P(N_{n:n} \le 1) = 0$ (respectively $= 1$) whenever n < N (respectively n = N).
If k = 2 and N > 2n, $P(N_{n:n} \le 2) = P(N_{n:n} = 2) = 0$. If N = 2n, $P(N_{n:n} = 2) = 1/\binom{2n-1}{n-1}$
is the probability of a regular configuration with all sampled points equally-spaced
by two arc length units. If n < N < 2n, $P(N_{n:n} = 2)$ is the probability of a
configuration with 2n - N neighbor points distant of one arc length unit and N - n
points distant of two units.

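As $N, k \to \infty$ while $k/N \to s \in (0,1)$, $\binom{N-pk-1}{n-1} / \binom{N-1}{n-1} \to (1 - ps)_+^{n-1}$,
so that the above formulas reduce to their continuum counterparts.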



With $0 \le a < b \le N$, the joint law of $(N_{1:n}, N_{n:n})$ is given by

(9)  $P(N_{1:n} > a, N_{n:n} \le b) = \frac{1}{\binom{N-1}{n-1}} \sum_{m=0}^{n} (-1)^m \binom{n}{m} \binom{N - (na + m(b-a)) - 1}{n-1}$.

In the random partitioning of N image, it gives the probability that the shares of
all n recipients all range between a and b. Putting (a = k, b = N) and (a = 0, b = k)
gives $\overline{F}_{N_{1:n}}(k)$ and $F_{N_{n:n}}(k)$. This formula was first obtained by [3] in the
continuum. Putting next a = k, b = 2k, we get

(10)  $P(N_{1:n} > k, N_{n:n} \le 2k) = \frac{1}{\binom{N-1}{n-1}} \sum_{m=0}^{n} (-1)^m \binom{n}{m} \binom{N - (n+m)k - 1}{n-1}$.

When k = 1, we have $P(N_{1:n} > 1, N_{n:n} \le 2) = \binom{N-1}{n-1}^{-1} 1_{N=2n}$. If N = 2n,
$\binom{2n-1}{n-1}^{-1}$ is the probability of the configuration where the n sampled points are
exactly equally-spaced, each by two arc length units.
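The joint law (9) lends itself to the same kind of numerical check. The sketch below assumes the reading of (9) in which every part lies in the range a < part <= b; `joint_law` and `joint_brute` are illustrative names, not the paper's.

```python
from itertools import combinations
from math import comb

def binom(a, b):
    """Binomial coefficient, set to 0 outside 0 <= b <= a."""
    return comb(a, b) if 0 <= b <= a else 0

def joint_law(N, n, a, b):
    """Formula (9): P(N_{1:n} > a, N_{n:n} <= b)."""
    total = sum((-1) ** m * comb(n, m) * binom(N - (n * a + m * (b - a)) - 1, n - 1)
                for m in range(n + 1))
    return total / comb(N - 1, n - 1)

def joint_brute(N, n, a, b):
    """Same probability by enumerating the compositions of N into n parts >= 1."""
    hits = total = 0
    for cuts in combinations(range(1, N), n - 1):
        bounds = (0,) + cuts + (N,)
        parts = [bounds[i + 1] - bounds[i] for i in range(n)]
        total += 1
        hits += all(a < x <= b for x in parts)
    return hits / total

if __name__ == "__main__":
    # Parking event (10) with k = 2: all parts in {3, 4}
    print(joint_law(14, 4, 2, 4), joint_brute(14, 4, 2, 4))
```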

As $n, N \to \infty$ while $n/N \to \rho < 1$, with E a rv with rate 1 exponential distribution

(11) - log (1 - p) n (iVi:„ ~l)^E; j^N^-.n 1, 

suggesting that the smaller (larger) integer component in the partition of N is of
order 1 (respectively $\log n$) in the considered asymptotic regime. More precisely,
using the joint law of $(N_{1:n}, N_{n:n})$

(12) log (1 - p) n (7Vi:„ - 1) , Nr,..n - ^) ^ [E, G) , 

where (E, G) are independent rvs on $\mathbb{R}_+ \times \mathbb{R}$ with distributions $P(E > t) = e^{-t}$ and
$P(G \le t) = e^{-e^{-t}}$, with $E(G) = \gamma$, the Euler constant (exponential and Gumbel).

Although in the random partitioning of N, all parts attributed to each recipient
are id, there is a great variability in the shares as the smallest one is of order 1 and
the largest one of order $\log n$.



3. N-circle covering problems

Let $S_n := \{M_1, .., M_n\}$ be the discrete set of points drawn at random on the
N-circle with circumference N. Fix $k \in \{1, .., N\}$. Consider the coarse-grained
discrete random set of intervals

(13)  $S_n(k) := \{M_1 + l, .., M_n + l, 1 \le l \le k\}$

appending clockwise an arc of integral length $k \ge 1$ to each starting-point atom of
$S_n$.

The number of gaps and the length of the covered set. Let $P_n(k)$ be the
number of gaps of $S_n(k)$ (which is also the number of connected components), so
with $P_n(k) = 0$ as soon as the N-circle is covered by $S_n(k)$.

Let also $\mathcal{L}_n(k)$ be the total integral length of $S_n(k)$. As there are $n - P_n(k)$
spacings covered by k and $P_n(k)$ gaps each contributing k to the covered length,
it can be expressed as a contribution of two terms ($i \wedge j = \min(i, j)$)

(14)  $\mathcal{L}_n(k) = \sum_{m=1}^{n - P_n(k)} N_{m:n} + k P_n(k) = \sum_{m=1}^{n} (N_{m,n} \wedge k)$.
Note also that the vacancy, which is the length of the N-circle not covered by any
arc, is

(15)  $\overline{\mathcal{L}}_n(k) := N - \mathcal{L}_n(k) = \sum_{p=1}^{P_n(k)} (N_{n-p+1:n} - k) = \sum_{m=1}^{n} (N_{m,n} - k)_+$,

summing the gaps' lengths over the gaps (with $N_{n:n} - k$ the largest gap size and
$N_{n - P_n(k) + 1:n} - k$ the smallest gap size). We recover the result (i) originally due
to [19] and its asymptotic consequences. The following statements are mainly due
to Holst, see [9]. It holds that



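(i) The distribution of $P_n(k)$ is

(16)  $P(P_n(k) = p) = \frac{1}{\binom{N-1}{n-1}} \sum_{m=p}^{n} (-1)^{m-p} \binom{m}{p} \binom{n}{m} \binom{N - mk - 1}{n-1}$.

(ii) As $n, N \to \infty$, while $n(1 - \frac{n}{N})^k \to \alpha$, $0 < \alpha < \infty$,

(17)  $P_n(k) \overset{d}{\to}$ Poi($\alpha$),

where Poi($\alpha$) is a random variable with Poisson distribution of parameter $\alpha$.
(iii)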

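a. Number of gaps. As $n, N \to \infty$ while $n/N \to \rho$, with $0 < \rho < 1$,

(18)  $\frac{1}{\sqrt{n}} (P_n(k) - n(1 - n/N)^k) \overset{d}{\to} \mathcal{N}(0, \sigma^2 = \overline{\rho}^k (1 - \overline{\rho}^k))$,

where $\mathcal{N}(m, \sigma^2)$ stands for the normal law with mean m and variance $\sigma^2$.
b. Gap length:

(19)  $\frac{1}{\sqrt{n}} (\overline{\mathcal{L}}_n(k) - N(1 - n/N)^k) \overset{d}{\to} \mathcal{N}(0, \sigma^2)$ as $N \to \infty$,

where $\sigma^2 = (1 + \overline{\rho} - \overline{\rho}^k) \overline{\rho}^k - (\overline{\rho} + k\rho)^2 \overline{\rho}^{2k-1}$, $\overline{\rho} = 1 - \rho$.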



The proofs of (ii) and (iiib) are in [9]. The one of (iiia) follows from similar
Central Limit Theorem arguments developed there. In the first case (ii),
$n \sim N(1 - (\frac{\alpha}{N})^{1/k})$ is very close to N: because of that, there are finitely many
gaps in the limit and the covering probability is $e^{-\alpha}$, so macroscopic. Whereas in
the second case (iii), $n \sim \rho N$ is quite small: the number of gaps is of order $n \overline{\rho}^k$
and the covering probability is expected to be exponentially small. Note from (iiib)
that the variance of the limiting normal law is 0 when k = 1, in accordance with
the fact that $\overline{\mathcal{L}}_n(1) = N - n$ remains constant.
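The law of the number of gaps in (i) can be evaluated exactly for small sizes, using the fact that the number of gaps is the number of spacings exceeding k. The following sketch assumes the standard inclusion-exclusion form of that law (sum over m from p to n of $(-1)^{m-p}\binom{m}{p}\binom{n}{m}\binom{N-mk-1}{n-1}$, divided by $\binom{N-1}{n-1}$); the function names are illustrative only.

```python
from itertools import combinations
from math import comb

def binom(a, b):
    """Binomial coefficient, set to 0 outside 0 <= b <= a."""
    return comb(a, b) if 0 <= b <= a else 0

def gap_pmf(N, n, k, p):
    """P(P_n(k) = p): probability of exactly p spacings exceeding k."""
    total = sum((-1) ** (m - p) * comb(m, p) * comb(n, m) * binom(N - m * k - 1, n - 1)
                for m in range(p, n + 1))
    return total / comb(N - 1, n - 1)

def gap_pmf_brute(N, n, k, p):
    """Same probability by listing all compositions of N into n parts >= 1."""
    hits = total = 0
    for cuts in combinations(range(1, N), n - 1):
        bounds = (0,) + cuts + (N,)
        parts = [bounds[i + 1] - bounds[i] for i in range(n)]
        total += 1
        hits += (sum(1 for x in parts if x > k) == p)
    return hits / total

if __name__ == "__main__":
    N, n, k = 11, 4, 2
    for p in range(n + 1):
        print(p, gap_pmf(N, n, k, p), gap_pmf_brute(N, n, k, p))
```

The value at p = 0 is the cover probability, and the value at p = n is the hard rods probability.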



The number of arcs needed to cover the N-circle. In (16), $P(P_n(k) = 0)$ is
the cover probability and $P(P_n(k) = n)$ the probability that no overlap of arcs or
rods takes place (the hard rods model). We have $P(P_n(k) = 0) = P(N_{n:n} \le k)$.

The cover probability $P(P_n(k) = 0)$ is also the probability that the number of arcs
of length k (the sample size), say $N(k)$, required to cover the N-circle is less than or
equal to n. We have $N(k) = \inf\{n : N_{n:n} \le k\}$. In other words,
$P(N(k) > n) = P(N_{n:n} > k)$ and so $E N(k) = \sum_{n \ge 1} P(N_{n:n} > k)$, with

$P(N_{n:n} > k) = \frac{1}{\binom{N-1}{n-1}} \sum_{m=1}^{n} (-1)^{m-1} \binom{n}{m} \binom{N - mk - 1}{n-1}$.

We wish to estimate $E N(k)$ as N grows large.

When $n(1 - \frac{n}{N})^k \to \alpha$, so when $n \sim N(1 - (\frac{\alpha}{N})^{1/k})$, we have
$P(P_n(k) = 0) = P(N(k) \le n) \to e^{-\alpha}$. Therefore, as $N \to \infty$,

(20)  $N^{1/k} (1 - \frac{N(k)}{N}) \overset{d}{\to} E_k$,

where $E_k$ has a Weibull(k) distribution with $P(E_k > x) = e^{-x^k}$ and
$E(E_k) = \Gamma(1 + k^{-1})$. Thus

(21)  $E N(k) \sim N (1 - \Gamma(1 + 1/k) N^{-1/k} + o(N^{-1/k}))$ as $N \to \infty$

is the estimated expected number of length-k arcs required to cover the N-circle.



4. Large deviation rate functions in the thermodynamical limit: 
Hard rods, covering and parking configurations 

k-Hard rods configurations are those for which $N_{1:n} > k \ge 1$ (the smallest part in
the decomposition of N exceeds the arc-length k: appending an arc of length k to all
sampled points does not result in overlapping of the added arcs). k-Covering
configurations with k > 1 are those for which $N_{n:n} \le k$ (the largest part in the
decomposition of N is smaller than arc-length k: appending an arc of length k to each sampled
point results in the covering of all points of the N-circle, a connectedness property).
k-Parking configurations are those for which both ($N_{1:n} > k$ and $N_{n:n} \le 2k$) (the
smallest part in the decomposition of N exceeds the arc-length k and the largest
part in the decomposition of N is smaller than twice the arc-length k: appending
an arc of length k to all sampled points results in a hard rods configuration where
sampled points are separated by gaps of length at least k but with the extra excess
gaps being smaller than k, so with no way to add a new rod (or car) with size k
without provoking an overlap). All these configurations are exponentially rare in
the thermodynamic limit $n, N \to \infty$ while $n/N \to \rho \in (0,1)$. We make this statement
precise by computing the large deviation rate functions in each case, extending
to the discrete formulation similar results obtained in the continuum, see [11].



4.1. Hard rods. k-hard rods configurations are those for which ($N_{1:n} > k \ge 1$)
[In the partitioning approach of the fortune N amongst n recipients, this event is
realized if the share of the poorest is bounded below by k, a rare event]. When the
number of sampled points n is a fraction of N (the case with a density $n = \rho N$),
there are too many sampled points for a non-overlapping configuration to occur with
a reasonably large probability. Rather, one expects that the probability of
non-overlapping (hard-rods) configurations tends to zero exponentially fast. To see
this, we need to evaluate the large n expansion of $P(N_{1:n} > k)$. Note that the
event $N_{1:n} > k$ is an event with positive probability if and only if $N \ge n(k+1)$
so, in the sequel, we shall assume that $\rho < 1/(k+1)$, $k \ge 1$. We have

$P(N_{1:n} > k) = \frac{Z_{n,N}}{\binom{N-1}{n-1}} = \frac{\binom{N-nk-1}{n-1}}{\binom{N-1}{n-1}}$,

where $Z_{n,N} = \sum_{k_1,..,k_n \ge 1} \prod_{m=1}^{n} 1_{k_m > k} 1_{\sum_m k_m = N}$. In the limit $n, N \to \infty$ with
$n/N \to \rho$,

(22)  $-\frac{1}{n} \log P(N_{1:n} > k) \sim -\frac{1}{\rho} (\rho \log \rho + (1-\rho) \log(1-\rho)) + \lim_{n \to \infty} -\frac{1}{n} \log Z_{n,N}$.

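In the limit $n, N \to \infty$ with fixed n/N limit, the quantity $P(N_{1:n} > k)$ is easier
to evaluate in an isobaric ensemble where the pressure p is held fixed instead of
$\sum_m k_m$. Therefore, relaxing the constraint $\sum_m k_m = N$, we shall work instead with
the modified random variables $N_{m,n}$ with exponentially tilted law

$P_p(N_{1,n} = k_1, .., N_{n,n} = k_n) = \frac{1}{Z_{n,p}} \prod_{m=1}^{n} e^{-p k_m} 1_{k_m > k}$.

Here

$Z_{n,p} = \sum_{k_1,..,k_n \ge 1} \prod_{m=1}^{n} e^{-p k_m} 1_{k_m > k} = (\frac{e^{-p(k+1)}}{1 - e^{-p}})^n$

is the normalizing constant.

Defining $G_{n,p} = -\log Z_{n,p}$, we have $\partial_p G_{n,p} = E_p(\sum_m N_{m,n} 1_{N_{m,n} > k})$ and we
must choose p in such a way that this equals N, leading to $N = n(k + 1 + \frac{e^{-p}}{1 - e^{-p}})$,
$\frac{1}{\rho} = \frac{e^{-p}}{1 - e^{-p}} + k + 1$, so $p = -\log(\frac{1 - \rho(k+1)}{1 - \rho k})$. The latter equation relating p, $\rho$ and
k is an equation of state. Due to the equivalence of ensembles principle, see [12] for
similar arguments, we have: $Z_{n,N} = e^{pN} Z_{n,p} O(N^{-1/2})$, leading to: $-\frac{1}{n} \log Z_{n,N} \sim
-\frac{1}{n} \log Z_{n,p} - \frac{p}{\rho}$. Proceeding in this way, we finally get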

(23)  $-\frac{1}{n} \log P(N_{1:n} > k) \to F_{hr}(p, \rho) := -\frac{1}{\rho} (\rho \log \rho + (1-\rho) \log(1-\rho)) - \frac{p}{\rho} - \log(\frac{e^{-p(k+1)}}{1 - e^{-p}})$

with $\rho \in (0, 1/(k+1))$. Here, thermodynamical "pressure" $p > 0$ and density $\rho$
are related through the "state equation" $\partial_p F_{hr}(p, \rho) = 0$ which can consistently be
checked to be

(24)  $\frac{1}{\rho} = k + 1 + \frac{e^{-p}}{1 - e^{-p}}$,

leading to $p = -\log(\frac{1 - \rho(k+1)}{1 - \rho k})$ (which is well-defined and positive because
$\rho < 1/(k+1)$). Thus $F_{hr}$ is an explicit entropy-like positive function of $\rho$ and k,
namely

(25)  $F_{hr}(\rho) = -\frac{1}{\rho} ((1-\rho) \log(1-\rho) - (1 - \rho(k+1)) \log(1 - \rho(k+1)) + (1 - \rho k) \log(1 - \rho k))$.
In the thermodynamical limit, hard-rods configurations are exceptional and the
hard-rods large deviation rate function $F_{hr}$ is an explicit function of $\rho$ and k. We
conclude that with probability tending to 1, $N_{1:n} = 1$: In the partitioning approach
of the fortune N amongst n recipients, the share of the poorest is the smallest
possible.

As $\rho \uparrow 1/(k+1)$, pressure tends to $\infty$ and $F_{hr}(\rho) \to (k+1) \log(k+1) - k \log k > 0$.
As $\rho \downarrow 0$, pressure tends to 0 and $F_{hr}(\rho) \to 0$.
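The hard rods rate function can be compared with the exact finite-size probability $\binom{N-nk-1}{n-1}/\binom{N-1}{n-1}$. The sketch below assumes the explicit form $F_{hr}(\rho) = -\frac{1}{\rho}((1-\rho)\log(1-\rho) - (1-\rho(k+1))\log(1-\rho(k+1)) + (1-\rho k)\log(1-\rho k))$; the function names and the sizes used are illustrative choices.

```python
from math import lgamma, log

def log_comb(a, b):
    """Natural log of the binomial coefficient C(a, b), via log-gamma."""
    return lgamma(a + 1) - lgamma(b + 1) - lgamma(a - b + 1)

def exact_rate(N, n, k):
    """-(1/n) log P(N_{1:n} > k) from the exact finite-size probability."""
    log_p = log_comb(N - n * k - 1, n - 1) - log_comb(N - 1, n - 1)
    return -log_p / n

def f_hr(rho, k):
    """Hard rods rate function, valid for 0 < rho < 1/(k+1)."""
    def xlogx(u):
        return u * log(u)
    return -(xlogx(1 - rho) - xlogx(1 - rho * (k + 1)) + xlogx(1 - rho * k)) / rho

if __name__ == "__main__":
    k, rho = 2, 0.2
    for N in (2_000, 20_000, 200_000):
        n = int(rho * N)
        print(N, exact_rate(N, n, k), f_hr(rho, k))
```

The finite-size values approach the limit with a correction of order (log n)/n.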



4.2. Covering configurations. Covering configurations are those for which we
have ($N_{n:n} \le k$). In the partitioning approach of the fortune N amongst n
recipients, this event is realized when the share of the richest is bounded above by k (a
rare event). Assume $n, N \to \infty$ with $n/N \to \rho \in (1/k, 1)$ where k > 1 is fixed.
One also expects that the probability of covering configurations by arcs of length
k tends to zero exponentially fast. Working now with

$Z_{n,p} = \sum_{k_1,..,k_n \ge 1} \prod_{m=1}^{n} e^{-p k_m} 1_{k_m \le k} = (\frac{e^{-p} (1 - e^{-pk})}{1 - e^{-p}})^n$

and proceeding as for the hard rods case, we easily get

$-\frac{1}{n} \log P(N_{n:n} \le k) \to F_c(p, \rho)$

where the covering large deviation rate function is

(26)  $F_c(p, \rho) = -\frac{1}{\rho} (\rho \log \rho + (1-\rho) \log(1-\rho)) - \frac{p}{\rho} - \log(\frac{e^{-p} (1 - e^{-pk})}{1 - e^{-p}})$.

Here, thermodynamical pressure p and density $\rho \in (1/k, 1)$ are related through the
covering state equation $\partial_p F_c(p, \rho) = 0$, namely

(27)  $\frac{1}{\rho} = 1 + \frac{e^{-p}}{1 - e^{-p}} - \frac{k e^{-pk}}{1 - e^{-pk}}$.

For all finite arc-length k, k-covering configurations are also exceptional. The
k-covering large deviation rate function $F_c$ is in general an implicit function of
$\rho$ and k, $\rho \in (1/k, 1)$. When $\rho \downarrow 1/k$, pressure tends to $-\infty$ and
$F_c(\rho) \to k \log k - (k-1) \log(k-1) > 0$. As $\rho \uparrow 1$, pressure tends to $\infty$ and $F_c(\rho) \to 0$.
By continuity, there is a value $\rho_0$ inside the definition domain of $\rho$ where p = 0.
We have $F_c(0, \rho_0) = -\frac{1}{\rho_0} (\rho_0 \log \rho_0 + (1 - \rho_0) \log(1 - \rho_0)) - \log k$. In the
partitioning approach of the fortune N amongst n recipients, the share of the richest is
bounded above with probability tending to 0 exponentially fast.

Remark: When k = 2, the covering equation of state can be solved explicitly
because it boils down to a second degree equation in $e^{-p}$. One finds $p = -\log(\frac{1-\rho}{2\rho - 1})$.
Plugging in this expression of p in $F_c(p, \rho)$ with k = 2 gives

$F_c = -\frac{1}{\rho} (2\rho \log \rho - (2\rho - 1) \log(2\rho - 1))$,

an explicit function of $\rho \in (1/2, 1)$. Note that $p \uparrow \infty$ as $\rho \uparrow 1$, $p \downarrow -\infty$ as $\rho \downarrow 1/2$
and p = 0 when $\rho = 2/3$. We have $F_c(0, 2/3) = \frac{3}{2} \log 3 - 2 \log 2$.
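For general k the covering state equation has no closed-form solution, but it is easy to solve numerically. The sketch below uses the fact that the right-hand side of the state equation is the mean of the tilted law with weights $e^{-pj}$ on j = 1, .., k, which decreases with p, so a plain bisection works. The function names, the bracketing interval and the iteration count are assumptions of this example, and the direct evaluation of the weights is only suitable for small k.

```python
from math import exp, log

def tilted_stats(p, lo, hi):
    """Return (Z1, mean) for the weights exp(-p*j) on j = lo..hi."""
    support = range(lo, hi + 1)
    weights = [exp(-p * j) for j in support]
    z1 = sum(weights)
    mean = sum(j * w for j, w in zip(support, weights)) / z1
    return z1, mean

def solve_pressure(rho, lo, hi, p_min=-60.0, p_max=60.0):
    """Bisection for the pressure p at which the tilted mean equals 1/rho."""
    target = 1.0 / rho
    for _ in range(200):
        mid = 0.5 * (p_min + p_max)
        if tilted_stats(mid, lo, hi)[1] > target:
            p_min = mid  # mean too large: pressure must increase
        else:
            p_max = mid
    return 0.5 * (p_min + p_max)

def covering_rate(rho, k):
    """Covering rate function at the pressure solving the state equation."""
    p = solve_pressure(rho, 1, k)
    z1, _ = tilted_stats(p, 1, k)
    entropy_term = -(rho * log(rho) + (1 - rho) * log(1 - rho)) / rho
    return entropy_term - p / rho - log(z1)

if __name__ == "__main__":
    for rho in (0.55, 0.6, 2 / 3, 0.8, 0.95):
        closed = -(2 * rho * log(rho) - (2 * rho - 1) * log(2 * rho - 1)) / rho
        print(rho, covering_rate(rho, 2), closed)
```

For k = 2 the numerical values can be compared with the explicit expression of the Remark, as done in the usage lines.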
4.3. Parking configurations. Parking configurations are those for which we have
($N_{1:n} > k$, $N_{n:n} \le 2k$). In the partitioning approach of the fortune N amongst n
recipients, this event is realized if the share of the richest is bounded above by twice
the share of the poorest. Assume $n, N \to \infty$ with $n/N \to \rho \in (1/(2k), 1/(k+1))$.
One expects that the probability of k-parking configurations tends to zero
exponentially fast. Working now with

$Z_{n,p} = \sum_{k_1,..,k_n \ge 1} \prod_{m=1}^{n} e^{-p k_m} 1_{k < k_m \le 2k} = (\frac{e^{-p(k+1)} (1 - e^{-pk})}{1 - e^{-p}})^n$

and proceeding as for the hard rods case, we easily get

$-\frac{1}{n} \log P(N_{1:n} > k, N_{n:n} \le 2k) \to F_p(p, \rho)$

where the parking large deviation rate function is

(28)  $F_p(p, \rho) = -\frac{1}{\rho} (\rho \log \rho + (1-\rho) \log(1-\rho)) - \frac{p}{\rho} - \log(\frac{e^{-p(k+1)} (1 - e^{-pk})}{1 - e^{-p}})$.

Here, thermodynamical pressure p and density $\rho \in (1/(2k), 1/(k+1))$ are related through the
parking equation of state $\partial_p F_p(p, \rho) = 0$, namely

(29)  $\frac{1}{\rho} = k + 1 + \frac{e^{-p}}{1 - e^{-p}} - \frac{k e^{-pk}}{1 - e^{-pk}}$.

The parking configurations large deviation rate function $F_p$ is an implicit
function of $\rho$ and k with $\rho \in (1/(2k), 1/(k+1))$. The latter formula can be extended
to the border case k = 1. Indeed, when k = 1, $\rho = 1/2$, pressure tends to $\infty$
and $F_p(p, \rho) \to 2 \log 2$. From Stirling formula, this is in agreement with the fact
that $P(N_{1:n} > 1, N_{n:n} \le 2) = \binom{2n-1}{n-1}^{-1} \ne 0$ only if N = 2n, which is the probability of
the regular configuration where the n sampled points are all exactly equally-spaced
by two arc length units.



Remark: When k = 2, the parking equation of state can be solved explicitly to
give $p = -\log(\frac{1 - 3\rho}{4\rho - 1})$. Plugging in this expression of p into $F_p(p, \rho)$ with k = 2
gives $F_p$ as an explicit function of $\rho \in (1/4, 1/3)$. Note that $p \uparrow \infty$ as $\rho \uparrow 1/3$,
$p \downarrow -\infty$ as $\rho \downarrow 1/4$ and p = 0 when $\rho = 2/7$.
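For k = 2, the parking rate function can be confronted with the exact finite-size parking probability obtained from (10), using Python's exact integer arithmetic for the alternating sum. This is a sketch under the assumption that the k = 2 pressure is $p = -\log((1-3\rho)/(4\rho-1))$, which follows from the state equation (29); the function names and the sizes N = 1400, n = 400 are illustrative.

```python
from math import comb, exp, log

def binom(a, b):
    """Binomial coefficient, set to 0 outside 0 <= b <= a."""
    return comb(a, b) if 0 <= b <= a else 0

def parking_count(N, n, k):
    """Number of compositions of N into n parts with k < part <= 2k, as in (10)."""
    return sum((-1) ** m * comb(n, m) * binom(N - (n + m) * k - 1, n - 1)
               for m in range(n + 1))

def parking_pressure_k2(rho):
    """Pressure solving the k = 2 parking state equation, for 1/4 < rho < 1/3."""
    return -log((1 - 3 * rho) / (4 * rho - 1))

def parking_rate_k2(rho):
    """Parking rate function for k = 2 at the state-equation pressure."""
    p = parking_pressure_k2(rho)
    z1 = exp(-3 * p) + exp(-4 * p)  # tilted weights on the allowed parts {3, 4}
    entropy_term = -(rho * log(rho) + (1 - rho) * log(1 - rho)) / rho
    return entropy_term - p / rho - log(z1)

if __name__ == "__main__":
    N, n, k = 1400, 400, 2
    exact = -(log(parking_count(N, n, k)) - log(comb(N - 1, n - 1))) / n
    print(exact, parking_rate_k2(n / N))
```

The exact value differs from the limit by a finite-size correction of order (log n)/n.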



5. The grand canonical partition of N 



Suppose N indistinguishable balls are assigned at random into N distinguishable
boxes. Let $N_{n,N} \ge 0$ be the number of balls in box number n. This leads to a
random partition of N now into N id summands which are $\ge 0$:

(30)  $N = \sum_{n=1}^{N} N_{n,N}$.

We have

(31)  $P(N_{1,N} = k_1, .., N_{N,N} = k_N) = \frac{1}{\binom{2N-1}{N}}$,

which is a Bose-Einstein distribution on the full N-simplex:

$\{ k_n \ge 0 \text{ satisfying } \sum_{n=1}^{N} k_n = N \}$.



Summing over all the $k_n$ but one, the marginal distribution of $N_{1,N}$ is easily seen
to be

(32)  $P(N_{1,N} = k) = \binom{2N-k-2}{N-k} / \binom{2N-1}{N}$, $k = 0, .., N$.


Let $P_N = \sum_{n=1}^{N} 1(N_{n,N} > 0)$ count the number of summands which are strictly
positive (the number of non-empty boxes). With $k_m \ge 1$ satisfying $\sum_{m=1}^{n} k_m = N$,
we obtain

(33)  $P(N_{1,N} = k_1, .., N_{n,N} = k_n; P_N = n) = \binom{N}{n} / \binom{2N-1}{N}$,

which is independent of the filled box occupancies $(k_1, .., k_n)$ (the probability being
uniform).

As there are $\binom{N-1}{n-1}$ sequences $k_m \ge 1$, $m = 1, .., n$, satisfying $\sum_{m=1}^{n} k_m = N$,
summing over the $k_m \ge 1$, we get the hypergeometric distribution for $P_N$:

(34)  $P(P_N = n) = \binom{N}{n} \binom{N-1}{n-1} / \binom{2N-1}{N}$, $n = 1, .., N$.

This distribution occurs in the following urn model: Draw N balls without
replacement from an urn containing 2N - 1 balls in total, N of which are white,
N - 1 are black. The law of $P_N$ describes the distribution of the number of white
balls drawn from the urn. Its mean is $N^2 / (2N - 1) \sim N/2$ and its variance is
$(N^2 (N-1)) / (2 (2N-1)^2) \sim N/8$.

As a result, 

(35)  $P(N_{1,N} = k_1, .., N_{n,N} = k_n \mid P_N = n) = \frac{1}{\binom{N-1}{n-1}} 1(|\mathbf{k}| = N)$,

which is the spacings conditional Bose-Einstein model with $k \ge 1$ described in (3).
The balls in boxes model just defined is therefore an extension of the conditional
Bose-Einstein model allowing the number of sampled points to be unknown and
random.
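The hypergeometric law of the number of non-empty boxes can be verified by listing all $\binom{2N-1}{N}$ occupancy vectors for a small N. This is a minimal sketch; `weak_compositions` and `nonempty_pmf` are hypothetical helper names.

```python
from itertools import combinations
from math import comb

def weak_compositions(total, boxes):
    """Yield all tuples of `boxes` integers >= 0 summing to `total`."""
    for cuts in combinations(range(1, total + boxes), boxes - 1):
        bounds = (0,) + cuts + (total + boxes,)
        yield tuple(bounds[i + 1] - bounds[i] - 1 for i in range(boxes))

def nonempty_pmf(N, n):
    """P(P_N = n) = C(N, n) C(N-1, n-1) / C(2N-1, N)."""
    return comb(N, n) * comb(N - 1, n - 1) / comb(2 * N - 1, N)

if __name__ == "__main__":
    N = 5
    configs = list(weak_compositions(N, N))
    for n in range(1, N + 1):
        freq = sum(1 for c in configs if sum(x > 0 for x in c) == n) / len(configs)
        print(n, freq, nonempty_pmf(N, n))
```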



Repetitions (grand canonical). It is likely that some boxes contain the same
number of particles. To take these multiplicities into account, let $A_{i,N}$, $i \in \{0, .., N\}$
count the number of boxes with exactly i balls, that is

(36)  $A_{i,N} = \#\{n \in \{1, .., N\} : N_{n,N} = i\} = \sum_{n=1}^{N} 1(N_{n,N} = i)$.

Then $\sum_{i=0}^{N} A_{i,N} = N$ where $\sum_{i=1}^{N} A_{i,N} = P_N$ is the number of filled boxes and
$A_{0,N} = N - P_N$ the number of empty ones. The joint probability of the $A_{i,N}$ is
given by the Ewens formula (see [4] and [13])

(37)  $P(A_{0,N} = a_0, A_{1,N} = a_1, .., A_{N,N} = a_N) = \frac{1}{\binom{2N-1}{N}} \frac{N!}{\prod_{i=0}^{N} a_i!}$

on the set $\sum_{i=0}^{N} a_i = \sum_{i=1}^{N} i a_i = N$.



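Let us now investigate the marginal law of the $A_{i,N}$. Firstly, the law of
$A_{0,N} = N - P_N$ clearly is

(38)  $P(A_{0,N} = a_0) = \binom{N}{a_0} \binom{N-1}{a_0} / \binom{2N-1}{N}$, $a_0 = 0, .., N-1$,

with $E(A_{0,N}) \sim N/2$. Secondly, recalling $A_{i,N} = \sum_{n=1}^{N} 1(N_{n,N} = i)$, with
$(N)_l := N(N-1)..(N-l+1)$, using the exchangeability of $(N_{1,N}, .., N_{N,N})$, the
probability generating function of $A_{i,N}$ ($i \ne 0$) reads

$E(z^{A_{i,N}}) = 1 + \sum_{l \ge 1} \frac{(z-1)^l}{l!} (N)_l P(N_{1,N} = i, .., N_{l,N} = i)$.

Using $P(N_{1,N} = k_1, .., N_{l,N} = k_l) = \binom{2N - l - 1 - \sum_{m=1}^{l} k_m}{N - \sum_{m=1}^{l} k_m} / \binom{2N-1}{N-1}$, we get the falling
factorial moments of $A_{i,N}$ as

(39)  $m_{l,i}(N) := E[(A_{i,N})_l] = (N)_l \binom{2N - l(i+1) - 1}{N - li} / \binom{2N-1}{N-1}$,

where $l \in \{0, .., l(i)\}$, $l(i) = [N/i]$ being the largest possible value of $A_{i,N}$ (with the
convention $\binom{-1}{0} = 1$ in the boundary case i = 1, l = N). The marginal distribution
of $A_{i,N}$ is thus

(40)  $P(A_{i,N} = a_i) = \sum_{l=a_i}^{l(i)} (-1)^{l - a_i} \binom{l}{a_i} \frac{m_{l,i}(N)}{l!}$.

If l = 1, $E(A_{i,N}) = N \binom{2N-i-2}{N-i} / \binom{2N-1}{N-1}$. The variance of $A_{i,N}$ is
$\sigma^2(A_{i,N}) = m_{2,i}(N) + m_{1,i}(N) - m_{1,i}(N)^2$.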



In particular, we find that $E(A_{1,N}) = N^2 / (2(2N-1)) \sim N/4$ is the mean
number of singleton boxes in the grand canonical model: when N is large, about
one fourth of the N boxes is filled by singletons (recall that one half of the N boxes
is filled by no ball). The variance of $A_{1,N}$ is $\sigma^2(A_{1,N}) \sim 3N/16$ so we expect that $A_{1,N}$,
properly normalized, converges to a normal distribution. Next, we can check that
$E(A_{2,N}) \sim N/8$ and further that $E(A_{i,N}) \sim N/2^{i+1}$, showing a geometric decay
in i of $E(A_{i,N})$.
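The mean multiplicities can be checked against a full enumeration of the occupancy vectors. The sketch below assumes the first-moment formula $E(A_{i,N}) = N\binom{2N-i-2}{N-i}/\binom{2N-1}{N}$, which follows from the marginal law (32) by exchangeability; the helper names are illustrative.

```python
from itertools import combinations
from math import comb

def weak_compositions(total, boxes):
    """Yield all tuples of `boxes` integers >= 0 summing to `total`."""
    for cuts in combinations(range(1, total + boxes), boxes - 1):
        bounds = (0,) + cuts + (total + boxes,)
        yield tuple(bounds[i + 1] - bounds[i] - 1 for i in range(boxes))

def mean_multiplicity(N, i):
    """E(A_{i,N}): expected number of boxes holding exactly i balls (N >= 2)."""
    return N * comb(2 * N - i - 2, N - i) / comb(2 * N - 1, N)

def mean_multiplicity_brute(N, i):
    """Same expectation by averaging over all occupancy vectors."""
    configs = list(weak_compositions(N, N))
    return sum(c.count(i) for c in configs) / len(configs)

if __name__ == "__main__":
    N = 6
    for i in range(N + 1):
        print(i, mean_multiplicity(N, i), mean_multiplicity_brute(N, i), N / 2 ** (i + 1))
```

The last printed column is the large-N approximation $N/2^{i+1}$, which is only indicative at such a small N.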

Finally, note that the probability that $A_{i,N}$ takes its maximal possible value $l(i) = [N/i]$ is

$P(A_{i,N} = l(i)) = m_{l(i),i}(N) / l(i)! = \binom{N}{l(i)} \binom{2N - l(i)(i+1) - 1}{N - l(i) i} / \binom{2N-1}{N-1}$.

For example, $P(A_{1,N} = N) = 1 / \binom{2N-1}{N-1}$ is the (exponentially small)
probability that all N boxes are filled by singletons.



Multiplicities and conditioning. Let us now investigate the same problem while
conditioning on $P_N = n$.

Firstly, note that $\sum_{i=1}^{N} i A_{i,N} = N$ is the total number of balls. Using the
multinomial formula, with $\sum_{i=1}^{N} i a_i = N$ and $\sum_{i=1}^{N} a_i = n$, we thus get

(41)  $P(A_{1,N} = a_1, .., A_{N,N} = a_N, P_N = n) = \frac{\binom{N}{n}}{\binom{2N-1}{N}} \frac{n!}{\prod_{i=1}^{N} a_i!}$

and

(42)  $P(A_{1,N} = a_1, .., A_{N,N} = a_N \mid P_N = n) = \frac{1}{\binom{N-1}{n-1}} \frac{n!}{\prod_{i=1}^{N} a_i!}$.

The latter formulae give the joint (Ewens-like) distributions of the repetition vector
count.

Let us investigate the marginal distribution of the $A_{i,N}$ conditionally given $P_N = n$.
Firstly, the law of $A_{0,N} = N - P_N$ is $P(A_{0,N} = a_0 \mid P_N = n) = 1(a_0 = N - n)$.

Secondly, recalling $A_{i,N} = \sum_{n=1}^{N} 1(N_{n,N} = i)$, with $(n)_l = n(n-1)..(n-l+1)$
(and $(n)_0 := 1$), using the exchangeability of $(N_{1,N}, .., N_{n,N})$, the conditional
probability generating function of $A_{i,N}$ reads

$E(z^{A_{i,N}} \mid P_N = n) = 1 + \sum_{l \ge 1} \frac{(z-1)^l}{l!} (n)_l P(N_{1,N} = i, .., N_{l,N} = i \mid P_N = n)$.

Using $P(N_{1,N} = k_1, .., N_{l,N} = k_l \mid P_N = n) = \binom{N - \sum_{m=1}^{l} k_m - 1}{n - l - 1} / \binom{N-1}{n-1}$, we get the
conditional falling factorial moments of $A_{i,N}$ as

(43)  $m_{l,i}(n, N) := E[(A_{i,N})_l \mid P_N = n] = (n)_l \binom{N - li - 1}{n - l - 1} / \binom{N-1}{n-1}$,

where $l \in \{0, .., l(i) = (n-1) \wedge [(N-1)/i]\}$. The conditional marginal
distribution of $A_{i,N}$ is thus

(44)  $P(A_{i,N} = a_i \mid P_N = n) = \sum_{l=a_i}^{l(i)} (-1)^{l - a_i} \binom{l}{a_i} \frac{m_{l,i}(n, N)}{l!}$.

If l = 1, $E(A_{i,N} \mid P_N = n) = n \binom{N-i-1}{n-2} / \binom{N-1}{n-1}$. In particular, $E(A_{1,N} \mid P_N = n) =
n(n-1)/(N-1)$ is the mean number of singleton boxes. In the thermodynamical
limit $n, N \to \infty$, $n/N \to \rho$, $E(A_{1,N} \mid P_N = n) \sim \rho n$ and a fraction $\rho$ of the n filled
boxes is filled with singletons. For the variance, we have $\sigma^2(A_{1,N} \mid P_N = n) \sim \rho (1-\rho)^2 n$.
We can also check that a fraction $\rho(1-\rho)$ of the n filled boxes is filled with
doubletons: $E(A_{2,N} \mid P_N = n) \sim \rho(1-\rho) n$ and more generally that
$E(A_{i,N} \mid P_N = n) \sim \rho (1-\rho)^{i-1} n$.
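The conditional first moments can be checked in the same spirit, since conditioning on n filled boxes brings us back to uniform compositions of N into n parts that are at least 1. This sketch assumes the first-moment formula $E(A_{i,N} \mid P_N = n) = n\binom{N-i-1}{n-2}/\binom{N-1}{n-1}$; the function names are illustrative.

```python
from itertools import combinations
from math import comb

def compositions(N, n):
    """Yield all n-tuples of integers >= 1 summing to N."""
    for cuts in combinations(range(1, N), n - 1):
        bounds = (0,) + cuts + (N,)
        yield tuple(bounds[i + 1] - bounds[i] for i in range(n))

def cond_mean_multiplicity(N, n, i):
    """E(A_{i,N} | P_N = n) for 1 <= i <= N - n + 1 and n >= 2."""
    return n * comb(N - i - 1, n - 2) / comb(N - 1, n - 1)

def cond_mean_brute(N, n, i):
    """Same expectation by averaging over all compositions."""
    configs = list(compositions(N, n))
    return sum(c.count(i) for c in configs) / len(configs)

if __name__ == "__main__":
    N, n = 9, 4
    for i in range(1, N - n + 2):
        print(i, cond_mean_multiplicity(N, n, i), cond_mean_brute(N, n, i))
```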

Finally, note that the probability that $A_{i,N}$ reaches its maximal possible value $l(i)$
is

$P(A_{i,N} = l(i) \mid P_N = n) = m_{l(i),i}(n, N) / l(i)! = \binom{n}{l(i)} \binom{N - l(i) i - 1}{n - l(i) - 1} / \binom{N-1}{n-1}$.

For example, $P(A_{1,N} = n - 1 \mid P_N = n) = n / \binom{N-1}{n-1}$ is the probability that n - 1
boxes are filled by singletons and one box by N - n + 1 balls, which is obvious.
6. Random Graph Connectivity

The latter model may be viewed as a clockwise k-nearest neighbor graph with N
vertices and kn edges. Consider as before N equally spaced points (vertices) on
the N-circle, so with arc length 1 between consecutive points. Draw at random
$n \in \{2, .., N-1\}$ points without replacement at the integer vertices of this circle.
Assume $N \le 2n$ and draw an edge at random from each of the n sampled points,
removing each sampled point once it has been paired. At the end of this process, we
get a random graph with N vertices and n (out-degree 1) edges whose endpoints are
no longer neighbors, being now chosen at random on $\{1, .., N\}$. We wish to estimate
the covering probability for this new model in the spirit of Erdős-Rényi random
graphs.

Let $B_m$, $m = 1, .., n$ be a sequence of independent (but not id) Bernoulli rvs with
success probabilities $p_m = \frac{N-n}{N-(m-1)}$, $m = 1, .., n$. With $[z^k] \phi(z)$ the $z^k$-coefficient
of $\phi(z)$, the N-covering probability is

(45)  $P(N - n \le \sum_{m=1}^{n} B_m \le n) = \sum_{k=N-n}^{n} [z^k] E(z^{\sum_{m=1}^{n} B_m})$,

which is just the probability to hit all points of the un-sampled set $\{n+1, .., N\}$ at
least once in a uniform pairing without replacement of the n-sample. This covering
probability is the probability of connectedness of the random graph with N vertices
and n out-degree 1 edges. It is of course zero if N > 2n. Let $\overline{p}_n = \frac{1}{n} \sum_{m=1}^{n} p_m$ be
the sample mean of the Bernoulli success probabilities. The covering probability can be bounded by

(46)  $P_{n,N}(\text{cover}) \le \sum_{k=N-n}^{n} \binom{n}{k} (1 - \overline{p}_n)^k \overline{p}_n^{n-k}$.

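Assume $n, N \to \infty$ while $n/N \to \rho$, so with $\rho \in (1/2, 1)$. Then
$\overline{p}_n \to -\frac{1}{\rho} (1-\rho) \log(1-\rho) =: \mu(\rho)$.

Clearly $\sigma^2(B_m) < \infty$ and $\sum_{m \ge 1} m^{-2} \sigma^2(B_m)$ has a finite limit. By Kolmogorov
Strong Law of Large Numbers, $\frac{1}{n} \sum_{m=1}^{n} B_m \to \mu(\rho)$ almost surely and so $P_N(\text{cover}) \to 1$ if
$\rho > \rho_c := 1 - e^{-1}$ because in this case the probability to estimate is

$P(\frac{1}{n} \sum_{m=1}^{n} B_m \in [\frac{1-\rho}{\rho}, 1])$ with $\mu(\rho) \in (\frac{1-\rho}{\rho}, 1)$.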



Whereas, when $\rho < 1 - e^{-1}$, the bound for the covering probability can be estimated
by

$P_{n,N}(\text{cover}) \le N \int_{1-\rho}^{\rho} \binom{N\rho}{Nx} (1 - \mu(\rho))^{Nx} \mu(\rho)^{N(\rho - x)} dx \sim C N \int_{1-\rho}^{\rho} e^{N H_\rho(x)} dx$,

where $H_\rho(x) = \rho \log \rho - x \log x - (\rho - x) \log(\rho - x) + x \log(1 - \mu(\rho)) + (\rho - x) \log \mu(\rho)$.
The function $x \to H_\rho(x)$ is concave and attains its maximum at $x = \rho(1 - \mu(\rho)) <
1 - \rho$, which is outside the integration interval $[1-\rho, \rho]$. By the saddle point method

(47)  $\liminf_{n,N \to \infty, n/N \to \rho} -\frac{1}{n} \log P_{n,N}(\text{cover}) = F_G(\rho) := -\frac{1}{\rho} H_\rho(1-\rho) > 0$.

So only in the low-density range $\frac{1}{2} < \rho < 1 - e^{-1}$ is the graph's connectedness
probability exponentially small. Note that the graph large deviation rate function $F_G$ is
maximal (minimal) at $\rho = 1/2$ ($\rho_c = 1 - e^{-1}$), with $F_G(\rho_c) = \log(e - 2) > 0$.

We conclude that in the random graph approach to the covering problem, in sharp
contrast to the k-nearest neighbor graph, there exists a critical density $\rho_c = 1 - e^{-1}$
above which covering occurs with probability one. These results illustrate
to what extent, when connections are not restricted to neighbors, the chance of
connectedness is increased. This question was also raised in ([2], p. 18) in relation
with Small-World graphs.